npj Precision Oncology

Springer Science and Business Media LLC

Preprints posted in the last 90 days, ranked by how well they match npj Precision Oncology's content profile, based on 48 papers previously published here. The average preprint has a 0.04% match score for this journal, so anything above that is already an above-average fit.

1
Artificial intelligence-driven virtual tumorboard enhances precision care in myelodysplastic syndromes

Swoboda, D. M.; DeZern, A. E.; England, J. T.; Venugopal, S.; Kehoe, T.; Aubrey, B. J.; Raddi, M. G.; Consagra, A.; Wang, J.; Andreadakis, J.; Rivero, G.; Stahl, M.; Zeidan, A. M.; Haferlach, T.; Brunner, A. M.; Buckstein, R.; Santini, V.; Della Porta, M. G.; Sekeres, M. A.; Nazha, A.

2026-03-27 hematology 10.64898/2026.03.26.26349088 medRxiv
Top 0.1%
15.2%

Background: Large language models (LLMs) perform well on standardized medical exam questions, but their reliability for complex hematology decision making is uncertain. We compared four general-purpose LLMs (GPT-4o, GPT-o3, Claude Sonnet 4, and DeepSeek-V3) with a Virtual MDS Panel (VMP), a coordinated multi-agent AI system in which domain-specialized, rule-bound software agents (WHO/ICC guidelines; IPSS-R/IPSS-M; NCCN) collaborate to generate tumor-board-level recommendations. Methods: Each model generated diagnostic, prognostic, and treatment recommendations for 30 myelodysplastic syndrome cases. Nine international MDS experts from five institutions, blinded to model identity, completed 3,000 structured ratings using 5-point Likert scales for diagnosis, prognosis, and therapy, and classified errors by severity. Results: General-purpose LLMs achieved modest expert ratings (overall mean scores: 3.7 for GPT-o3, 3.2 for GPT-4o, 3.1 for DeepSeek, and 3.0 for Claude) and contained major factual errors in at least 24% of responses. The VMP increased the proportion of outputs rated 4 or higher to 87% (vs. 34-66% for general-purpose models), improved mean scores to 4.3 overall (4.3 for diagnosis, 4.4 for prognosis, and 4.1 for therapy), and reduced major errors to 8%. Conclusions: In this blinded evaluation of 30 complex MDS cases, general-purpose LLMs produced clinically important errors at rates that raise safety concerns for autonomous hematology decision making. The VMP, a rule-bound, multi-agent architecture, approached expert-level accuracy, supporting its potential future role as an effective decision-support tool for MDS.

2
Beyond Binary MRD: Quantitative ctDNA Interpretation After Curative-Intent Surgery for Colorectal Cancer

Kim, J.; Ye, S.; Kwak, J.-M.; Choi, D.; Kim, S.; Jeong, H. J.; Hong, E.; Lee, J. W.; Kim, S.; Won, Y.-H.; Koo, S. S.; Lee, I. S.; Park, T.; Yoon, J. B.; Oh, H.; Lee, Y. J.; Ahn, S.-J.; Kim, J.-S.; Kim, H.-K.; Cho, H.-W.; Lee, S.; Hong, J.; Razavi, P.; Kim, J.; Hur, J. W.

2026-03-10 oncology 10.64898/2026.03.09.26347910 medRxiv
Top 0.1%
13.8%

Background: Circulating tumor DNA (ctDNA) detection after curative-intent surgery is being used to identify minimal residual disease (MRD) in colorectal cancer (CRC). However, MRD classification depends on analytical sensitivity, and the impact of the detection threshold on observed post-operative positivity remains incompletely characterized. We evaluated MRD positivity in stage I-III CRC using a CRISPR-based plasma sequencing assay, MUTE-Seq. Methods: Patients were prospectively enrolled and analyzed using customized tumor-informed panels applied to baseline and post-operative plasma samples collected at 4 weeks and 3 months. We report preliminary results from 39 plasma samples obtained from the first 14 patients. MRD positivity was assessed across multiple hypothetical detection thresholds (1-100 ppm). Results: All 14 patients (100%) had detectable mutations at baseline. The number of mutation-positive calls decreased significantly after surgery (baseline vs 4 weeks, p = 0.006; baseline vs 3 months, p = 0.004), and ctDNA concentration likewise declined (baseline vs 4 weeks, p = 0.002; baseline vs 3 months, p = 0.003). Among stage II-III patients, MRD positivity at 4 weeks was 20% at a 100-ppm threshold but increased to 70% at 10 ppm and 100% at 1 ppm. At 3 months, MRD positivity was 11% at a 100-ppm threshold and 78% at 1 ppm. At both time points, approximately 80% of MRD-positive stage II-III patients harbored ctDNA levels below 100 ppm, and half of these cases were below 15 ppm. Two patients (one stage I and one stage II) developed recurrence; both were MRD-positive at 4 weeks and demonstrated increasing mutation-positive calls at 3 months, with a median radiologic lead time of 4 months. Conclusions: Post-operative MRD classification in CRC is strongly influenced by analytical sensitivity. A substantial proportion of residual disease signals reside below the conventional ctDNA detection threshold of 100 ppm, supporting the clinical relevance of ultrasensitive ctDNA detection.
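To make the threshold dependence described in this abstract concrete, the sketch below recomputes an MRD positivity rate at several detection thresholds. The function name `mrd_positivity` and the ctDNA values are hypothetical and chosen only for illustration; they are not data from the study, and the rule "positive if the level is at or above the threshold" is an assumed simplification of how an assay call would be made.

```python
def mrd_positivity(ctdna_ppm, thresholds_ppm):
    """Return the fraction of patients called MRD-positive at each threshold.

    A patient counts as positive when the measured ctDNA level (in ppm)
    is at or above the detection threshold (assumed calling rule).
    """
    n = len(ctdna_ppm)
    if n == 0:
        raise ValueError("at least one patient is required")
    return {t: sum(v >= t for v in ctdna_ppm) / n for t in thresholds_ppm}


# Hypothetical post-operative ctDNA levels for six patients (not study data).
hypothetical_ppm = [0.0, 2.0, 5.0, 12.0, 40.0, 150.0]
rates = mrd_positivity(hypothetical_ppm, [100, 10, 1])

for threshold, rate in rates.items():
    print(f"threshold {threshold:>3} ppm: {rate:.0%} MRD-positive")
```

The same set of patients yields very different positivity rates as the threshold drops, which is the effect the abstract reports.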

3
DNA methylation variability defines a fundamental dimension of tumor epigenomes linked to genomic instability, tumor aggressiveness, and clinical outcomes

Bukovec, D.; Gjorgjioski, B.; Misheva, M. S.; Kungulovski, G.

2026-03-14 cancer biology 10.64898/2026.03.12.711303 medRxiv
Top 0.1%
12.0%

Background: Tumors exhibit substantial cellular and molecular diversity driven by genetic and epigenetic mechanisms. Large-scale profiling efforts have established aberrant DNA methylation as a universal hallmark of cancer. Beyond changes in mean methylation levels, tumor tissues exhibit elevated DNA methylation variability at specific genomic regions within and across tumors. This constitutes a fundamental dimension of cancer epigenomes, reflecting disrupted maintenance of epigenomic states and stochastic drift, which may enable adaptation to the microenvironment, phenotypic plasticity, invasion, disease progression, and treatment resistance. However, the genome-wide organization and functional consequences of DNA methylation variability across cancer types remain incompletely understood. Methods: We analyzed paired tumor-normal DNA methylation profiles across 16 cancer types to systematically quantify DNA methylation variability. Pan-cancer DNA methylation variability was consistently observed using complementary statistical approaches and multiple modes of data representation. We identified cancer-specific and pan-cancer differentially variable regions and evaluated their associations with genomic features, transcriptional and chromatin regulators, and biological processes. Variability was quantified using three measures per sample: the proportion of intermediately methylated sites (PIM), genome-wide Shannon entropy, and a DNA methylation-based stemness index. Associations with genomic instability, tumor biological features, and clinical outcomes were subsequently assessed. Results: Tumor samples consistently exhibited higher DNA methylation variability than matched normal tissues, reflected by increased dispersion and wider interquartile ranges. Pan-cancer variably methylated regions were depleted in promoters and enriched in open sea regions, in heterochromatic H3K27me3-decorated PRC2-repressed domains, and at enhancers. They preferentially contained motifs for transcription factors involved in developmental regulation. Elevated DNA methylation variability, captured by higher PIM, entropy, and stemness scores, was associated with increased genomic instability manifested by higher aneuploidy, increased DNA break points, a greater fraction of the genome altered, and increased tumor mutational burden, as well as with aggressive tumor features such as lymph node involvement, post-therapy neoplasm events, and elevated hypoxia scores. Importantly, tumors with high DNA methylation variability exhibited significantly worse overall, progression-free, and disease-free survival. Conclusions: DNA methylation variability is a pervasive and clinically relevant feature of tumor epigenomes, reflecting epigenetic and genetic instability, expanded regulatory plasticity, and tumor aggressiveness.
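As a rough illustration of two of the per-sample variability measures named in this abstract, the sketch below computes a proportion of intermediately methylated sites and a mean per-site Shannon entropy from methylation beta values. The abstract does not give the exact definitions, so the 0.2 to 0.8 cutoffs for "intermediate" and the use of binary entropy averaged across sites are assumptions, and the function names are hypothetical.

```python
import math


def pim(betas, low=0.2, high=0.8):
    """Proportion of sites with intermediate methylation.

    The cutoffs `low` and `high` are assumed; the source does not state them.
    """
    return sum(low <= b <= high for b in betas) / len(betas)


def mean_binary_entropy(betas):
    """Mean Shannon entropy (in bits) of per-site methylation fractions.

    Each beta value is treated as the probability that a site is methylated,
    so fully methylated or unmethylated sites contribute zero entropy.
    """
    def site_entropy(b):
        if b <= 0.0 or b >= 1.0:
            return 0.0
        return -(b * math.log2(b) + (1 - b) * math.log2(1 - b))

    return sum(site_entropy(b) for b in betas) / len(betas)


# Hypothetical beta values for one sample (not study data).
sample = [0.0, 0.5, 1.0, 0.1]
print(pim(sample), mean_binary_entropy(sample))
```

Under these assumed definitions, both measures rise as more sites drift away from the fully methylated or fully unmethylated state.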

4
Novel polymeric fluoropyrimidine CF10 demonstrates superior therapeutic index and survival advantage in patient-derived models of 5-fluorouracil-refractory colorectal cancer

Sah, N.; Omy, T. R.; Kairamkonda, S.; Acharya, G.; Palle, H.; Luna, P.; Mani, C.; Gmeiner, W.; Cheedella, N.; Reedy, M.; Palle, K.

2026-04-08 cancer biology 10.64898/2026.04.05.716582 medRxiv
Top 0.1%
10.6%

Background: Fluoropyrimidines, specifically 5-fluorouracil (5-FU), remain the cornerstone of colorectal cancer (CRC) therapy. However, intrinsic and acquired resistance, alongside dose-limiting systemic toxicities, often result in treatment failure and disease relapse. There is a pressing clinical need for next-generation fluoropyrimidines that retain antitumor activity in 5-FU-refractory CRC models while maintaining a favorable safety profile. Methods: We evaluated the antitumor efficacy of CF10, a novel polymeric fluoropyrimidine designed for the sustained delivery of FdUMP, against equimolar 5-FU. We utilized a diverse panel of six patient-derived CRC organoid (PDO) models to assess 3D growth inhibition under both normoxic (~20% O2) and physioxic (5% O2) conditions. Mechanisms of action were investigated via γH2AX signaling (DNA damage), Annexin V/PI flow cytometry (death kinetics), and ALDEFLUOR assays (stem-like populations). Functional suppression of metastasis-associated phenotypes was evaluated using 3D Matrigel invasion assays. Finally, the therapeutic index and overall survival were validated in vivo using two independent patient-cell-derived xenograft (PCDX) models (TX-CC-199 and TX-CC-201). Results: CF10 demonstrated significantly greater suppression of organoid growth compared to equimolar 5-FU across all patient-derived lines, regardless of morphological heterogeneity or oxygen tension. In 3D invasion assays, CF10 achieved superior anti-invasive activity even at a 10-fold lower molar dose than 5-FU. This functional advantage was mirrored by a marked depletion of the ALDH-high stem-like subpopulation, which was largely recalcitrant to 5-FU. Mechanistically, CF10 induced intensified replication stress, DNA damage, and repair signaling (γH2AX, Top1cc/pRPA32, FANCD2), and pushed CRC cells into irreversible, terminal PI-positive death states. In vivo, CF10 treatment resulted in profound tumor growth inhibition and a robust survival advantage in both PCDX models (log-rank P < 0.01) without inducing systemic weight loss or noticeable toxicity. Conclusions: By integrating 3D patient-derived modeling with in vivo validation, we demonstrate that CF10 effectively overcomes the biological and pharmacological limitations of 5-FU. CF10 targets the aggressive, invasive, and stem-like subpopulations of CRC that drive clinical relapses. These findings provide a compelling translational rationale for the clinical development of CF10 as a superior alternative to standard fluoropyrimidines in both treatment-naive and refractory CRC. Significance Statement: Despite the foundational role of 5-fluorouracil (5-FU) in colorectal cancer (CRC) therapy, resistance and systemic toxicity remain major barriers to curative outcomes. This study identifies CF10, a novel polymeric fluoropyrimidine, as a superior alternative that overcomes 5-FU resistance in biologically diverse patient-derived organoids and xenograft models. Crucially, CF10 demonstrates a unique capacity to suppress the invasive, aldehyde dehydrogenase (ALDH)-high stem-like subpopulations that likely survive standard chemotherapy (5-FU), while maintaining efficacy under physiological oxygen levels and providing a significant survival advantage in vivo with improved tolerability. CF10 represents a promising translational candidate for the treatment of both treatment-naive and refractory CRC.

5
DNA methylation signatures of mismatch repair-deficient colorectal cancer

Ward, R.; Endicott, M.; Mallabar-Rimmer, B.; Burrage, J.; Sherwood, K.; Huang, Q.; Ward, J. C.; Thorn, S.; Woolley, C.; Wood, S.; Dempster, E.; Green, H. D.; Tomlinson, I.; Webster, A. P.

2026-04-13 cancer biology 10.64898/2026.04.09.717165 medRxiv
Top 0.1%
10.2%

Background: Colorectal cancer (CRC) is a molecularly heterogeneous disease shaped by both genetic and epigenetic alterations. Approximately 15% of CRCs display widespread CpG island hypermethylation, known as the CpG Island Methylator Phenotype (CIMP). CIMP-high (CIMP-H) tumours frequently exhibit MLH1 promoter hypermethylation, leading to mismatch repair deficiency (MMRd) and microsatellite instability (MSI). However, DNA methylation patterns associated with MSI, independent of CIMP and MLH1 silencing, and the influence of clinical variables such as anatomical location and patient age on the CRC methylome remain poorly characterised. Methods: We performed epigenome-wide DNA methylation profiling of 259 primary CRC tissue samples using the Illumina EPICv2 array, comparing differential methylation between MSI and microsatellite stable (MSS) CRC, adjusting for tumour purity, MLH1 promoter methylation, CIMP status, and anatomical location, to account for known confounders. We further evaluated the independent effects of anatomical location and patient age on global methylation patterns. Results: Epigenome-wide differential methylation between MSS and MSI CRC was dominated by MLH1 promoter hypermethylation. After adjusting for MLH1 hypermethylation and CIMP status, we identified a distinct set of 656 CpG sites associated with MMRd independent of MLH1 silencing. These included hypermethylation at LRP6, GSK3β, and CDK12, implicating altered WNT signalling and transcriptional regulation pathways. Comparison of MSI subgroups revealed the co-occurrence of MLH1 hypermethylation with promoter hypermethylation at TXNRD1. Anatomical location showed a strong independent effect on methylation patterns, while we observed only modest effects of patient age on the CRC methylome after adjustment for confounders. Conclusions: We identified a distinct methylation profile distinguishing MSS and MSI CRC, including MLH1-independent markers of MMRd, as well as novel differentially methylated loci within MSI subgroups. We further showed that anatomical location has a strong independent impact on the CRC methylome. Together, these findings refine the molecular characterisation of CRC and highlight potential epigenetic markers that could inform patient stratification and precision oncology.

6
Attention-Enhanced U-Net Segmentation for Reliable Detection of Circulating Tumor-Associated Cells.

Cristofanilli, M.; Limaye, S.; Rohatgi, N.; Crook, T.; Al-Shamsi, H.; Gaya, A.; Page, R.; Shreeniwas, A.; Patil, D.; Datta, V.; Akolkar, D.; Schuster, S.; Agrawal, P.; Patel, S.; Shejwalkar, P.; Golar, S.; Srinivasan, A.; Datar, R.

2026-03-09 oncology 10.64898/2026.03.07.26347846 medRxiv
Top 0.1%
10.0%

Background: Circulating tumor-associated cell (CTAC) detection-based multi-cancer early detection (MCED) strategies may be hindered by the rarity of CTACs among millions of peripheral blood nucleated cells (PBNCs). We developed an advanced U-Net-based encoder-decoder model for pixel-level CTAC discrimination that integrates attention-gated skip connections to preserve morphological and fluorescence details. Methods: Model suitability was explored in an initial cohort of asymptomatic individuals (n = 428) and patients with advanced solid tumors (n = 354). A case-control study assessed clinical performance in therapy-naive stage I/II cancer patients (n = 185), individuals with benign conditions (n = 129), and asymptomatic individuals (n = 111). The model was then validated across four prospective studies on distinct populations: recurrent cancer cases with low tumor burden (n = 224); patients with solid tumors in the peri-operative setting (n = 17); suspected cancer cases (n = 259); and asymptomatic individuals (n = 7,183). All studies used blinded peripheral blood specimens from which PBNCs were isolated, stained for EpCAM / Hoechst 33342, and imaged. Ground truth annotations were established via pathologist review. The U-Net pipeline encoded spatial information in the images via convolutional and pooling layers and generated pixel-wise segmentation masks to identify CTACs. In all studies, sensitivity was based on the CTAC detection rate in cancer specimens, and specificity on the CTAC undetectability rate in specimens from healthy asymptomatic individuals or those with benign conditions. Results: In the exploratory study, the model had 90.68% (95% CI: 87.16%, 93.50%) sensitivity and 99.53% (95% CI: 98.32%, 99.94%) specificity. In the case-control cohort, the model had 88.65% sensitivity (95% CI: 83.17%, 92.83%), 78.95% (95% CI: 71.03%, 85.53%) specificity in benign conditions, and >99.9% specificity in asymptomatic individuals. Among the four prospective studies, the model had: (a) 91.96% (95% CI: 87.60%, 95.17%) sensitivity in pretreated patients with low tumor burden; (b) 100% sensitivity in pre-surgery specimens and 29.41% sensitivity in post-surgery specimens; (c) 96.34% PPV (95% CI: 93.22%, 98.05%) and 32.35% NPV (95% CI: 25.58%, 39.95%) for diagnostic triaging; and (d) 11% PPV (95% CI: 31.72%, 53.24%) and 99.97% NPV (95% CI: 99.90%, 99.99%) for MCED in healthy asymptomatic individuals. Conclusions: The attention-enhanced U-Net achieved robust, generalizable performance for CTAC detection in case-control and prospective cohorts, supporting its clinical utility for accurate cancer detection.
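The four performance measures reported in this abstract all derive from the same confusion-matrix counts. The sketch below shows the standard definitions; the function name `diagnostic_metrics` and the counts are hypothetical and are not taken from the study.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Compute sensitivity, specificity, PPV, and NPV from confusion counts.

    tp/fn: cancer specimens with CTACs detected / not detected.
    tn/fp: non-cancer specimens with CTACs not detected / detected.
    """
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


# Hypothetical counts (not study data): 100 cancer and 1,000 non-cancer specimens.
metrics = diagnostic_metrics(tp=90, fn=10, tn=950, fp=50)
for name, value in metrics.items():
    print(f"{name}: {value:.2%}")
```

Because PPV and NPV depend on how common cancer is in the tested population, the same sensitivity and specificity can give a high PPV in a suspected-cancer cohort and a much lower one in an asymptomatic screening cohort.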

7
Quantifying Treatment Resistance in Mixtures of Gastrointestinal Stromal Tumor Cells with BARMIX

Darbalaei, M.; Muhlenberg, T.; Zummack, J.; Dujardin, P.; Grunewald, S.; Baginska, A.; Munteanu, P.; Martinez Cruz, M.; Dorsch, M.; Schramm, A.; Bauer, S.; Hoffmann, D.; Gruner, B. M.

2026-03-25 systems biology 10.64898/2026.03.23.713602 medRxiv
Top 0.1%
9.9%

Targeted therapies in gastrointestinal stromal tumors (GIST) often fail due to heterogeneous resistance mutations arising across metastatic sites. Efficient, rational design of mutation-specific therapies requires the ability to quantify treatment resistance across many genotypes in parallel. Here, we present BARcode MIXture analysis (BARMIX), a platform combining multiplexed experiments with DNA-barcoded cancer cell mixtures in vitro and in vivo, and a probabilistic framework for quantitative assessment of genotype-specific treatment resistance. BARMIX efficiently and accurately recapitulated known clinical resistance patterns in GIST and matched resistance measurements from individual cell lines in vitro and in vivo. This experimental-computational approach provides a scalable and broadly applicable strategy for quantifying treatment responses in complex cell populations, enabling systematic preclinical testing of new drugs and combinations to identify mutation-specific therapeutic options for precision oncology in GIST and beyond.

8
Prediction of Mutations and Outcome in Gastrointestinal Stromal Tumors with Deep Learning: A Multicenter, Multinational Study

Bonetti, A.; Le, V.-L.; Carrero, Z. I.; Wolf, F.; Gustav, M.; Lam, S. W.; Vanhersecke, L.; Sobczuk, P.; LE LOARER, F.; Lenarcik, M.; Rutkowski, P.; van Sabben, J. M.; Steeghs, N.; van Boven, H.; Machado, I.; Bague, S.; Navarro, S.; Medina-Ceballos, E.; Agra, C.; Giner, F.; Tapia, G.; Hernandez Gallego, A.; Civantos Jubera, G.; Cuatrecasas, M.; Lopez-Prades, S.; Perret, R. E.; Soubeyran, I.; Khalifa, E.; Blouin, L.; Wardelmann, E.; Meurgey, A.; Collini, P.; Voloshin, A.; Yatabe, Y.; Hirano, H.; Gronchi, A.; Nishida, T.; Bouche, O.; Emile, J.-F.; NGO, C.; Hohenberger, P.; Cotarelo, C.; Jakob, J.

2026-02-03 oncology 10.64898/2026.02.02.26345350 medRxiv
Top 0.1%
8.5%

Background: Gastrointestinal stromal tumor (GIST) is the most common gastrointestinal mesenchymal tumor, driven by tyrosine-protein kinase KIT and platelet-derived growth factor receptor A (PDGFRA) mutations. Specific variants, such as KIT exon 11 deletions, carry prognostic and therapeutic implications, whereas wild-type (WT) variants derive limited benefit from tyrosine kinase inhibitors (TKIs). Given the limited reproducibility of established clinicopathological risk models, deep learning (DL) applied to whole-slide images (WSIs) has emerged as a promising tool for molecular classification and prognostic assessment. Patients and methods: We analyzed 8398 GIST cases from 21 centers in 7 countries, including 7238 with molecular data and 2638 with clinical follow-up. DL models were trained on WSIs to predict mutations, treatment sensitivity, and recurrence-free survival (RFS). Results: DL predicted mutational status in GIST from WSIs, with an area under the curve (AUC) of 0.87 for KIT and 0.96 for PDGFRA. High performance was observed for subtypes, including KIT exon 11 delins 557-558 (0.67) and PDGFRA exon 18 D842V (0.93). For therapeutic categories, performance reached 0.84 for avapritinib sensitivity and 0.81 for imatinib sensitivity. DL models predicted RFS, with hazard ratios (HR) of 8.44 (95% CI 6.14-11.61) in the overall cohort and 4.74 (95% CI 3.34-6.74) in patients receiving adjuvant therapy. Prognostic performance was comparable to pathology-based scores, with highest discrimination in the overall cohort and in patients without adjuvant therapy (HR 9.44, 95% CI 5.87-15.20). Conclusion: DL applied to WSIs enables prediction of molecular alterations, treatment sensitivity, and RFS in GIST, performing comparably to established risk scores across international cohorts and providing a baseline for future multimodal predictors. Highlights: (1) Deep learning on histology predicts KIT and PDGFRA mutations in a large international cohort of GISTs from multiple centers. (2) Whole-slide image models stratify recurrence-free survival comparably to pathology-based risk scores. (3) Prognostic value of deep learning is preserved in adjuvant therapy subgroups, supporting treatment duration decisions. Graphical abstract: Overview of study design and dataset characteristics. (A) Multinational collection of WSIs from seven countries (Spain, France, Italy, Germany, the Netherlands, Poland, and Japan), followed by standard image preprocessing with the STAMP pipeline and clinical data preprocessing/standardization via the Grammar Data Curation framework. The workflow was divided into two main branches: (i) molecular mutation and treatment sensitivity prediction, and (ii) RFS prediction. Model performance was evaluated using AUROC and F1 score for classification tasks, and Kaplan-Meier survival curves with hazard ratios for RFS. Model explainability was assessed through heatmaps of WSIs and identification of top predictive tiles. (B) Summary of clinical dataset composition: proportion of cases receiving adjuvant therapy, tumor location distribution, mutation distribution at the exon level, and mutation distribution at the codon level.

9
Pan-cancer survival modeling reveals structural limits of genomic feature integration in immunotherapy outcomes

Hassan, W.; Adeleke, S.

2026-04-18 bioinformatics 10.64898/2026.04.15.718634 medRxiv
Top 0.1%
8.3%

Background: Immune checkpoint inhibitors (ICIs) have improved outcomes across multiple cancer types, yet reliable predictors of survival remain limited. While genomic features such as tumor mutational burden (TMB) are widely used, their contribution to predictive modeling in heterogeneous real-world cohorts remains unclear. We evaluated the relative contributions of clinical and whole-genome sequencing (WGS) features in pan-cancer survival modeling. Methods: We analyzed 658 patients treated with ICIs with matched WGS data from Genomics England. Using a leakage-controlled machine learning framework with strict train-test separation, we compared four models: TMB-only, clinical-only, clinical+TMB, and an integrated 11-feature clinico-genomic XGBoost survival model. Model performance was assessed using Harrell's concordance index (C-index) with bootstrap confidence intervals. Results: TMB alone demonstrated near-random discrimination (C-index 0.50; 95% CI 0.44-0.56). Clinical variables substantially improved predictive performance (0.59; 95% CI 0.53-0.64), with marginal gain from adding TMB (0.59). The integrated model achieved a C-index of 0.60 (95% CI 0.55-0.65). While the improvement over TMB alone was significant, the incremental gain beyond optimized clinical models was modest. Feature attribution analysis showed that model performance was dominated by clinical variables, with genomic features contributing limited additional signal. Conclusions: These findings suggest that, in heterogeneous pan-cancer cohorts, predictive performance is constrained by the underlying data structure, in which dominant clinical signals overshadow genome-scale features. This study highlights fundamental limitations in integrating genomic data into survival models across diverse cancer types and provides a benchmark for future computational approaches.
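For readers unfamiliar with the metric used in this abstract, the sketch below is a minimal pure-Python version of Harrell's C-index. It is not the authors' evaluation code: the function name is hypothetical, pairs with tied event times are simply skipped, and ties in predicted risk are scored as 0.5, which is one common convention.

```python
def harrell_c_index(times, events, risks):
    """Harrell's concordance index for right-censored survival data.

    times:  observed follow-up times.
    events: 1 if the event was observed, 0 if censored.
    risks:  predicted risk scores (higher means earlier expected event).
    A pair (i, j) is comparable when i had an observed event before time j.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored subject cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data (not study data).
times = [1, 2, 3, 4]
events = [1, 1, 0, 1]
print(harrell_c_index(times, events, [4, 3, 2, 1]))  # risk ordered with outcome
print(harrell_c_index(times, events, [1, 1, 1, 1]))  # uninformative model
```

A value of 0.5 corresponds to random ordering, which is why the reported TMB-only C-index of 0.50 is described as near-random discrimination.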

10
A network-based deep learning model integrating subclonal architecture for therapy response prediction in cancer

Kim, S.; Ha, D.; Nam, A.-r.; Cheong, S.; Lee, J.; Kim, S.; Park, S.

2026-03-17 cancer biology 10.64898/2026.03.14.711567 medRxiv
Top 0.1%
8.2%

Predicting treatment response remains challenging in oncology, particularly given the growing diversity of therapeutic options. Despite efforts using gene expression signatures or integrative multi-omics frameworks, robust and interpretable biomarkers remain limited. We present SubNetDL, a deep learning framework that integrates subclonal mutation profiles and protein-protein interaction networks via network propagation. Unlike condition-specific approaches, SubNetDL leverages somatic mutations alone and is applicable across diverse cancer types and treatment modalities. Applied to ten TCGA cancer-drug combinations, SubNetDL achieved consistently strong performance (median AUROC = 0.74) and successfully generalized to two independent immunotherapy datasets (median AUROC = 0.77). Importantly, it identified candidate biomarker genes with treatment-specific relevance. SubNetDL prioritized genes that were not central in the network, highlighting its ability to capture context-specific patterns beyond traditional metrics. In conclusion, our approach offers a robust and interpretable framework for identifying predictive biomarkers and stratifying patients based on mutation profiles and network context. Motivation: Intratumoral heterogeneity is a fundamental driver of therapeutic resistance, yet most predictive models rely on aggregate mutational burdens or static gene expression signatures, overlooking the subclonal dynamics that shape treatment outcomes. While network biology offers a functional lens to interpret genomic alterations, a framework that explicitly bridges subclonal architecture with system-level molecular interactions has been lacking. To address this, we developed SubNetDL, a deep learning framework that integrates patient-specific subclonal profiles with protein-protein interaction networks. By leveraging only somatic mutation data, SubNetDL captures the functional convergence of subclonal evolution, providing a robust and interpretable platform for patient stratification and biomarker discovery across diverse oncological contexts.
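The abstract does not specify which network propagation scheme SubNetDL uses, so the following is only a generic random-walk-with-restart sketch of the idea of spreading mutation scores over an interaction network. The function name, the toy graph, the gene labels, and the restart probability of 0.5 are all hypothetical.

```python
def propagate(adjacency, seed_scores, restart=0.5, n_iter=100):
    """Random walk with restart on an undirected graph.

    adjacency:   dict mapping each node to a list of its neighbours
                 (every node is assumed to have at least one neighbour).
    seed_scores: dict of initial scores, e.g. mutated genes weighted by
                 subclonal fraction (assumed input, summing to 1).
    """
    nodes = list(adjacency)
    p0 = {n: seed_scores.get(n, 0.0) for n in nodes}
    p = dict(p0)
    for _ in range(n_iter):
        nxt = {}
        for n in nodes:
            # Each neighbour m passes on an equal share of its current score.
            incoming = sum(p[m] / len(adjacency[m]) for m in adjacency[n])
            nxt[n] = (1 - restart) * incoming + restart * p0[n]
        p = nxt
    return p


# Hypothetical three-gene path network A - B - C with a mutation in A.
toy_network = {"A": ["B"], "B": ["A", "C"], "C": ["B"]}
scores = propagate(toy_network, {"A": 1.0})
print(scores)
```

After propagation, genes near a mutated gene receive non-zero scores even when they are not mutated themselves, which is what lets a downstream model use network context rather than mutation status alone.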

11
Consensus Through Diversity: A Comprehensive Benchmark of Multi-Omic Approaches for Precision Breast Oncology

Sionakidis, A.; Pinilla Alba, K.; Abraham, J.; Simidjievski, N.

2026-04-21 bioinformatics 10.64898/2026.04.17.719159 medRxiv
Top 0.1%
7.2%

Emerging multi-omic profiling has made it feasible to subtype disease using multiple molecular layers. However, inconsistent preprocessing, heterogeneous implementations, variable evaluation, and limited reproducibility often constrain method selection. Here, we systematically benchmark 22 publicly available unsupervised approaches for bulk data on the TCGA-BRCA cohort across five modalities (RNA-seq, miRNA, DNA methylation, copy numbers, single nucleotide polymorphisms) and validate findings in two independent datasets, enabling a multi-layered comparison of performance, heterogeneous data support, and interpretability. Most approaches fuse multi-omic data to produce a two-cluster solution largely aligned with ER status, with higher-resolution approaches further refining these into four coherent subclasses (angiogenic luminal, oxidative-phosphorylation/HER2-low luminal, immune-inflamed basal-like, and hyper-proliferative basal-like). Our benchmarking results indicate that methods based on similarity networks can efficiently produce stable, reliable partitions. Matrix factorisation and Bayesian factorisation algorithms produce rich latent representations, allowing quantification of feature and modality contributions, albeit at higher computational cost. Consensus clustering can be used on a case-by-case basis to refine partitions into more robust and generalisable findings. We aggregate our insights into a decision workflow that aligns with study goals, data characteristics, and computational resources, enabling optimal analytic strategies. This comprehensive assessment provides a practical roadmap for investigators seeking to extract reproducible, biologically meaningful subtypes from complex multi-omic datasets. We highlight the different technical and practical benefits and trade-offs that shape the selection and development of multi-omic approaches applied in precision oncology.

12
Distinct Spatial Programs of Response versus Resistance in Non-Small Cell Lung Cancer after Neoadjuvant Chemoimmunotherapy

Park, S. H.; Koh, J.; Bae, S.; Choi, H.; Yun, T.; Park, J. H.; Na, B.; Park, S.; Lee, H. J.; Park, I. K.; Kang, C. H.; Kim, Y. T.; Na, K. J.

2026-04-07 cancer biology 10.64898/2026.04.05.716543 medRxiv
Top 0.1%
7.2%

Background: Neoadjuvant chemoimmunotherapy (nCIT) has become a standard treatment for locally advanced resectable non-small cell lung cancer (NSCLC), yet the spatial biology underlying treatment resistance remains poorly understood. We used spatial transcriptomics to define the microenvironmental architecture of residual cancers in patients who did not achieve major pathologic response (non-MPR) compared with those who did (MPR). Methods: Spatial transcriptomics was performed on 10 formalin-fixed paraffin-embedded (FFPE) tumor blocks (5 MPR, 5 non-MPR) obtained from 8 patients treated with nCIT. A deep learning algorithm was applied to distinguish viable residual cancer spots from treatment-induced fibrosis and necrosis. Spatial deconvolution, distance modeling, ligand-receptor analysis, and functional pathway scoring were integrated to characterize niche-specific programs. Results: The MPR cancer core displayed an immune-permissive remodeling environment with deep infiltration of cytotoxic CD8+ T cells, mature dendritic cells (LAMP3+, CCR7+), and active efferocytosis signaling (APOE-TREM2), alongside robust MHC class II expression. The non-MPR cancer core, by contrast, exhibited spatial immune exclusion: a dense fibroblast barrier reinforced by TIMP1-CD63 signaling and Treg-enriched boundaries physically restricted effector T cell access to the cancer core. Residual cancer cells in non-MPR samples maintained active cell cycling and independently upregulated cytochrome P450-mediated drug detoxification and DNA damage response pathways without inducing MHC class II expression, effectively decoupling intrinsic survival from immune recognition. The non-MPR core also showed a hyper-metabolic profile, including elevated glutathione metabolism consistent with antioxidant buffering against chemotherapy-induced oxidative stress. TROP2 was broadly expressed across the non-MPR cancer core and co-localized with DNA damage response and nuclear factor erythroid 2-related factor 2 resistance signatures. Conclusions: Residual cancer cores in non-MPR tumors appear to represent evolved resistant niches sustained by structural immune exclusion, metabolic rewiring, and DNA repair proficiency. These findings highlight the spatial co-localization of epithelial anchors, such as TROP2, with intrinsic resistance pathways, providing a structural rationale for developing novel precision therapeutic strategies to bypass stromal barriers and overcome the cancer core's intrinsic repair capacity.

13
Ex vivo drug testing in metastatic biopsies reveals patient-specific vulnerabilities to cancer targeting and immune activating drugs

Woehrl, L.; Das, D.; Weidele, K.; Treitschke, S.; Baron, C.; Halbritter, D.; Botteron, C.; Lueke, F.; Stojanovic Guzvic, N.; Werner-Klein, M.; Harrer, D. C.; Pukrop, T.; Lanznaster, J.; Nitsch, T.; Suedhoff, T.; Fischer, N.; Kubuschok, B.; Claus, R.; Benz, M.; Bruns, V.; Hoffmann, M.; Stutz, A.; Klein, C. A.; Werno, C.

2026-02-09 cancer biology 10.64898/2026.02.06.704037 medRxiv
Top 0.1%
6.8%
Show abstract

Biomarker-guided therapies in oncology often fail to induce considerable responses in patients with advanced cancer. As a complementary approach, direct drug testing on individual patient samples is highly attractive yet is currently hampered by the lack of assays that combine (i) fast reporting, (ii) the ability to inform about immune-mediated responses, (iii) robust quantification, and (iv) scalability for parallel assessment of multiple drugs. Here, we introduce our patient-derived ex vivo drug response assay (PEDRA) that fulfills all these requirements. Using malignant pleural effusions (MPEs) from five non-small cell lung cancer (NSCLC) patients with detailed clinical treatment histories, we tested 52 guideline-recommended therapies and eight investigational antibody-drug conjugates (ADCs). In all patients, PEDRA identified treatment options that outperformed the therapies the patients had received. The results reflected clinical observations as well as expectations derived from mutational profiling and disease courses. To extend the applicability of PEDRA beyond MPEs to other metastatic lesions, we generated a protocol starting from core needle biopsies. Owing to its reproducible and quantitative nature, PEDRA may provide a valuable diagnostic tool to guide time-sensitive clinical therapy decisions. Additionally, PEDRA has great potential for preclinical testing of investigational drugs, thereby reducing the need for animal experiments.

14
Mechanistic learning to predict and understand minimal residual disease

Marzban, S.; Robertson-Tessi, M.; West, J.

2026-04-21 cancer biology 10.64898/2026.04.16.718968 medRxiv
Top 0.1%
6.8%
Show abstract

Mechanistic modeling has long been used as a tool to describe the dynamics of biological systems, especially cancer in response to treatment. Its key advantage lies in the interpretability of relationships between input parameters and outcomes of interest. In contrast, machine learning techniques offer strong prediction performance, especially for high-dimensional datasets that are common in oncology. Here, we employ a Mechanistic Learning framework that combines the advantages of both approaches by training machine learning models on mechanistic parameters inferred from clinical patient data. The mechanistic model (a Markov chain model) contains sixteen parameters that describe the rate of cell fate transitions that occur in patients with B-cell precursor acute lymphoblastic leukemia. The machine learning model (a ridge logistic regression) is trained on these parameters to predict two clinically relevant features: BCR::ABL1 fusion gene status (positive or negative) and minimal residual disease status (positive or negative) post-induction chemotherapy. Model training is done in an iterative fashion to assess which (and how many) parameters are critical to maintain high predictive performance. Using machine learning models trained on the clinical flow-cytometry data, we find that the stem-like cell state alone is the most predictive feature for both BCR::ABL1-positive and MRD-positive disease, with combination scores (defined as the average of accuracy, balanced accuracy, and area under the curve) of 0.80 and 0.67, respectively. By comparison, mechanistic learning achieves comparable or improved combination scores for BCR::ABL1-positive and MRD-positive disease, with scores of 0.81 and 0.71, respectively, using only de-differentiation for BCR::ABL1 and primitive-state persistence together with differentiation-directed exit for MRD. Thus, the mechanistic-learning approach not only preserves predictive performance, but also provides a biological hypothesis for why stemness is predictive of these clinically relevant outcomes.
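
The abstract defines its combination score as the average of accuracy, balanced accuracy, and area under the curve. A minimal sketch of that arithmetic follows, assuming binary 0/1 labels, a hypothetical 0.5 decision threshold, and a rank-based ROC AUC; the function names are illustrative and are not taken from the preprint.

```python
from statistics import mean


def accuracy(y_true, y_pred):
    # Fraction of predictions that match the true labels.
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def balanced_accuracy(y_true, y_pred):
    # Mean of per-class recall (specificity and sensitivity for 0/1 labels).
    recalls = []
    for cls in (0, 1):
        idx = [i for i, t in enumerate(y_true) if t == cls]
        recalls.append(sum(y_pred[i] == cls for i in idx) / len(idx))
    return mean(recalls)


def roc_auc(y_true, scores):
    # Probability that a random positive is scored above a random negative;
    # ties count as half a win.
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def combination_score(y_true, scores, threshold=0.5):
    # Average of the three metrics; the threshold is an assumption made here.
    y_pred = [int(s >= threshold) for s in scores]
    return mean([
        accuracy(y_true, y_pred),
        balanced_accuracy(y_true, y_pred),
        roc_auc(y_true, scores),
    ])


if __name__ == "__main__":
    print(combination_score([1, 1, 0, 0], [0.9, 0.4, 0.6, 0.1]))
```

Averaging the three metrics keeps any single one, such as raw accuracy on an imbalanced cohort, from dominating the summary.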

15
Variant-Level Functional Classification of Monoallelic TP53 Mutations Refines Prognostic Stratification in Myelodysplastic Neoplasms Beyond Allelic Status

Streuer, A.; Ochi, Y.; Riabov, V.; Nannya, Y.; Steiner, L.; Abba, M.; Metzgeroth, G.; Altrock, E.; Rapp, F.; Nowak, V.; Hepgueluem, E.; Nowak, D.; Hofmann, W.-K.; Ogawa, S.; Schmitt, N.

2026-03-20 hematology 10.64898/2026.03.18.26348425 medRxiv
Top 0.1%
6.7%
Show abstract

TP53 mutations represent one of the strongest adverse prognostic factors in myelodysplastic neoplasms (MDS). While multi-hit TP53 (TP53multiHit) alterations uniformly lead to very poor outcomes, the prognostic relevance of monoallelic TP53 (TP53mono) mutations remains controversial. TP53 variants can cause loss-of-function, dominant-negative, or gain-of-function effects. We hypothesized that functional heterogeneity among TP53 variants contributes to the variable clinical behavior observed in monoallelic TP53-mutated MDS. Therefore, we analyzed pretreatment samples from 4,505 patients with MDS from two independent cohorts (IWG, n=3,173; J-MDS, n=1,332), including 271 patients with TP53mono and 499 with TP53multiHit. Functional annotation of TP53 variants was performed using a previously published phenotype score (PS) derived from saturation mutagenesis screens, capturing dominant-negative and loss-of-function effects. Median overall survival (OS) differed significantly by TP53 allelic state (TP53 wild-type (TP53wt) 42.4 months; TP53mono 22.9 months; TP53multiHit 9.2 months; p < 0.001). Within the TP53mono subgroup, functional annotation identified marked heterogeneity. Patients with high PS (≥7) showed significantly inferior OS compared with those with low PS (median OS: 13.8 vs. 39.2 months; HR 1.68, 95% CI 1.16-2.42; p = 0.006), particularly for IPSS-R and IPSS-M low-risk cases. Combining PS and variant allele frequency (VAF) further improved risk stratification. TP53mono patients with PS ≥7 and VAF ≥22% had outcomes comparable to TP53multiHit (median OS: 8.8, p = 0.2), whereas those with PS <7 and VAF <22% exhibited survival similar to TP53wt (median OS: 49.7, p = 0.9). Overall, functional annotation of TP53 variants refines prognostication in TP53mono-mutated MDS and may enhance individualized risk assessment.
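
The two-threshold stratification described in the abstract (PS ≥7 together with VAF ≥22%, versus PS <7 together with VAF <22%) can be written as a small rule. This is only an illustrative sketch: the function name and group labels are hypothetical, and the abstract does not say how patients with one high and one low value were handled, so they are returned as "discordant" here.

```python
def tp53mono_risk_group(phenotype_score, vaf_percent,
                        ps_cutoff=7.0, vaf_cutoff=22.0):
    """Assign a TP53mono patient to an illustrative risk group.

    Cutoffs follow the abstract (PS >= 7, VAF >= 22%); the returned
    labels are hypothetical names, not terms used by the authors.
    """
    high_ps = phenotype_score >= ps_cutoff
    high_vaf = vaf_percent >= vaf_cutoff
    if high_ps and high_vaf:
        # Outcomes reported as comparable to TP53multiHit.
        return "multiHit-like"
    if not high_ps and not high_vaf:
        # Survival reported as similar to TP53 wild-type.
        return "wild-type-like"
    # Mixed case: not described in the abstract.
    return "discordant"


if __name__ == "__main__":
    print(tp53mono_risk_group(8.2, 35.0))
    print(tp53mono_risk_group(3.1, 10.0))
```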

16
Deep learning-based non-invasive profiling of tumor transcriptomes from cell-free DNA for precision oncology

Patton, R. D.; Netzley, A.; Persse, T. W.; Nair, A.; Galipeau, P. C.; Coleman, I. M.; Itagi, P.; Chandra, P.; Adil, M.; Vashisth, M.; Sayar, E.; Hiatt, J. B.; Dumpit, R.; Kollath, L.; Demirci, R. A.; Ghodsi, A.; Lam, H.-M.; Morrissey, C.; Iravani, A.; Chen, D. L.; Hsieh, A. C.; MacPherson, D.; Haffner, M. C.; Nelson, P. S.; Ha, G.

2026-02-12 bioinformatics 10.64898/2026.02.10.705188 medRxiv
Top 0.1%
6.5%
Show abstract

Circulating tumor DNA (ctDNA) profiling from liquid biopsies is increasingly adopted as a minimally invasive solution for clinical cancer diagnostic applications. Current methods for inferring gene expression from ctDNA require specialized assays or ultra-deep, targeted sequencing, which preclude transcriptome-wide profiling at single-gene resolution. Herein we jointly introduce Triton, a tool for comprehensive fragmentomic and nucleosome profiling of cell-free DNA (cfDNA), and Proteus, a multi-modal deep learning framework for predicting single gene expression, using standard depth (~30-120x) whole genome sequencing of cfDNA. By synthesizing fragmentation and inferred nucleosome positioning patterns in the promoter and gene body from Triton, Proteus reproduced expression profiles using pure ctDNA from patient-derived xenografts (PDX) with an accuracy similar to RNA-Seq technical replicates. Applying Proteus to cfDNA from four patient cohorts with matched tumor RNA-Seq, we show that the model accurately predicted the expression of specific prognostic and phenotype markers and therapeutic targets. As an analog to RNA-Seq, we further confirmed the immediate applicability of Proteus to existing tools through accurate prediction of gene pathway enrichment scores. Our results demonstrate the potential clinical utility of Triton and Proteus as non-invasive tools for precision oncology applications such as cancer monitoring and therapeutic guidance. Subjects: Circulating tumor DNA, liquid biopsies, patient-derived xenografts, whole genome sequencing, deep learning, convolutional neural network, gene expression

17
Systematic Evaluation of Transfer Learning Strategies for Clinical Chemotherapy Response Prediction

Du, H.; Ballester, P.

2026-02-17 bioinformatics 10.64898/2026.02.16.706121 medRxiv
Top 0.1%
6.4%
Show abstract

Accurately predicting chemotherapy response remains a major challenge in precision oncology. Although machine-learning models based on tumour omics data have shown promise, the majority of existing studies are trained and evaluated on pre-clinical cell-line datasets, leaving their clinical applicability insufficiently characterised. In this study, we systematically evaluate a range of transfer-learning strategies for chemotherapy response prediction under realistic clinical constraints using patient data from The Cancer Genome Atlas (TCGA). Rather than proposing a new predictive model, we focus on assessing the effectiveness and limitations of commonly used approaches for transferring pre-clinical knowledge to clinical settings. These include cell-line-validated biomarkers, biologically informed feature representations, direct application of pre-clinical deep-learning models, model fine-tuning, and hybrid strategies that integrate pre-clinical predictions with clinical data. All methods are evaluated within a unified framework using consistent cohort construction, shared performance metrics, and bias-controlled validation procedures. Across multiple drugs and molecular data types, we find that most transfer strategies--including biomarker-based feature selection and direct pre-clinical model transfer--fail to produce robust or consistent improvements in clinical prediction performance. In contrast, conservative approaches based on fine-tuning pre-clinical models or incorporating pre-clinical predictions as features in clinical models yield more stable and reproducible gains. Further improvements are observed when basic pre-treatment clinical variables are integrated. Together, our results demonstrate the practical boundaries of pre-clinical to clinical transfer for drug response prediction and highlight hybrid and fine-tuning strategies as more reliable baselines for future translational modelling efforts.

18
KRAS inhibition is an effective therapy for appendiceal adenocarcinoma

Chowdhury, S.; Ito, I.; Pattalachinti, V. K.; Yousef, A. M.; Yousef, M. M.; Khoury, S. E.; Hornstein, N.; Seldomridge, A. N.; Hong, D.; Overman, M. J.; Taggart, M. W.; Foo, W. C.; Helmink, B.; Fournier, K. F.; Shen, J. P.

2026-04-10 cancer biology 10.64898/2026.04.07.717107 medRxiv
Top 0.1%
6.3%
Show abstract

Background: Appendiceal adenocarcinoma (AA) is a rare cancer with limited treatment options. KRAS is the most commonly mutated gene in AA and a promising therapeutic target, but its preclinical and translational relevance in AA remains unclear. Methods: We evaluated the KRASG12D-specific inhibitor MRTX1133 and the pan-KRAS inhibitor RMC-6236 in KRASmut organoid and orthotopic PDX models of AA. Tumor-intrinsic and microenvironmental responses were characterized using multi-omics profiling. Clinical outcomes were also assessed in six heavily pre-treated AA patients treated with KRAS inhibitors. Results: MRTX1133 was highly effective for KRASG12D organoids (IC50=4.1 nM); both KRASG12D and KRASG12V organoids were sensitive to RMC-6236 (IC50=4.4 nM vs 0.5 nM, respectively). In orthotopic PDX models of peritoneal carcinomatosis from AA, MRTX1133 significantly reduced tumor growth in the KRASG12D model TM00351, and RMC-6236 reduced tumor growth in the KRASG12V model AAPDX-16. Pathologic evaluation showed dramatically reduced tumor cellularity, proliferation, and pERK expression as well as induction of apoptosis. Gene Set Enrichment Analysis (GSEA) revealed significant downregulation of E2F targets (NES=-1.9, p-adj=0.06) and the newly developed RAS/ERK (NES=-2.3, p-adj=0.06) gene set, consistent with the observed decrease in cell proliferation. There was marked upregulation of EMT (NES=2.7, FDR<0.001) and TGF-β signaling (NES=2.3, FDR=0.004) in remaining tumor cells, suggesting these pathways could confer resistance. scRNA-seq analysis of the TME showed dramatic shifts in cancer-associated fibroblasts (CAFs), with KRAS inhibition driving a shift from normal fibroblasts to inflammatory CAFs, and upregulation of interferon alpha and gamma pathways, suggesting that KRAS inhibition can activate the innate immune response in the setting of peritoneal metastases. In a cohort of 6 heavily pre-treated patients with AA treated with KRAS inhibitors (1 G12D, 3 G12C, 2 pan-KRAS), all had biochemical response based on CEA/CA19-9 or ctDNA and clinical benefit by RECIST criteria (1 CR, 1 PR, 4 SD). Conclusions: While effective suppression of RAS/ERK signaling by KRAS inhibitors reduces tumor growth, adaptive activation of EMT and TGF-β pathways may mediate resistance in KRASmut AA. Additionally, KRAS inhibition remodels the TME and may enhance innate immune signaling. These findings support continued clinical development of KRAS inhibitors in AA and provide a rationale for combination strategies targeting resistance pathways and stromal remodeling.

19
Discrete Transcriptional States Define Biphasic Immune Response and Dynamic CMS Transitions in Colorectal Cancer

Ishani, K.; Wangmo, D.; Ali, A.; Gates, T.; Yan, Z.; Gustafson, A. P.; Boytim, E.; Storey, K.; Goffredo, P.; Hwang, J.; Subramanian, S.

2026-02-06 cancer biology 10.64898/2026.02.03.703597 medRxiv
Top 0.1%
6.3%
Show abstract

Background: Sequential alterations in APC, KRAS, TP53, and SMAD4 have been proposed as a framework for colorectal cancer progression. Human colorectal cancer datasets have not revealed the biological transitions associated with these mutations. When examining a cohort of TCGA colorectal tumors grouped as AK (APC/KRAS), AKP (APC/KRAS/TP53), and AKPS (APC/KRAS/TP53/SMAD4), we observed no significant differences in immune-cell composition, the four previously defined Consensus Molecular Subtypes (CMS1/2/3/4), or transcriptomic clustering between these genomic groups. Therefore, these canonical alterations do not sufficiently characterize the known properties of metastatic progression in human colorectal cancer. Methods: To overcome these limitations, we developed a genetically defined, organoid-based, orthotopic mouse model whereby mouse colon organoids modeling sequential APC, KRAS, TP53, and SMAD4 alterations were orthotopically injected into the colon. This was followed by RNA-sequencing data processing, normalization with DESeq2, differential expression, pathway enrichment, and immune/stromal inference. Gene co-expression modules were identified from variance-stabilized mouse expression data, mapped to 1:1 human orthologs, and summarized as eigengenes. A multinomial logistic regression model trained on mouse eigengenes was applied to TCGA-COAD human tumors to assign them to mouse-informed transcriptomic states (AK-like, AKP-like, AKPS-like), which were then used for downstream visualization and comparative analyses. Results: Whole-transcriptome analysis revealed discrete transcriptional states and immune-cell differences between the organoid AK/AKP/AKPS groups. Early TP53 loss led to strong activation of immune pathways, accompanied by increased infiltration of NK and T cells. As tumors progressed with SMAD4 loss and metastasis, this immune activity collapsed, giving rise to broad immune suppression. CMS classifications also shifted, with AK tumors resembling epithelial CMS2, AKP tumors displaying immune-rich CMS1 features, and AKPS and metastatic lesions adopting mesenchymal CMS4 characteristics. We then applied a progression-based transcriptomic classifier to 460 human colorectal tumors. This reclassification revealed conserved immune remodeling, CMS transitions, pathway-level differences, and significant differences in patient survival. Conclusion: We show that organoid-derived progression profiles reveal hidden evolutionary structure in human colorectal cancer and provide a transcriptional framework for interpreting metastatic potential and clinical outcomes.

20
A Context-Aware Target Engagement and Pharmacodynamic Biomarker Resource to Accelerate Drug Discovery and Development

Yang, Y.; Zhao, L.; Orouji, S.; Zhu, Y.; Johnson, R. L.; Maxwell, D. S.; Mica, I.; Russell, K. P.; Al-lazikani, B.

2026-04-22 bioinformatics 10.64898/2026.04.19.719411 medRxiv
Top 0.1%
6.3%
Show abstract

Confirming target engagement in experimental tumor models remains a major challenge in oncology drug development. Pharmacodynamic biomarkers can help address this, but few systematic resources link drug targets to candidate biomarkers. We developed TargetTrace, a comprehensive resource to identify and prioritize pharmacodynamic biomarkers across nine key target classes: transcription factors/cofactors, kinases, phosphatases, ubiquitin ligases, deubiquitinases, acetyltransferases, deacetylases, methyltransferases, and demethylases. Biomarker candidates were gathered from curated molecular interaction resources and refined using external annotations to improve accuracy. For enzyme targets with measurable substrate changes, we applied a two-agent large language model workflow, followed by manual review, to harmonize antibody information from the antibody resources and ensure that the selected biomarkers are measurable with existing laboratory tests. From more than 92,000 input interactions and over 2,300 targets, we compiled 71,323 target-biomarker relationships involving 2,270 potential drug targets, encompassing both transcription factor/cofactor-target gene and enzyme-substrate interactions. Commercial antibodies were available for over 1,400 biomarkers, supporting laboratory validation. TargetTrace provides a structured and reusable resource for systematic identification and prioritization of pharmacodynamic biomarkers in oncology.